Document Decomposition into Geometric and Logical Layout
Abstract
We present an Android application that scans LaTeX-typeset documents to determine their logical layout. The algorithm first prepares the image for processing, then locates the text and figures within the document, and finally classifies these components.
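The three stages named in the abstract (preparation, locating regions, classification) can be illustrated with a minimal sketch. The abstract does not say how each stage is implemented, so everything here is an assumption made for illustration: fixed-threshold binarization, horizontal band detection by scanning for rows that contain ink, and a height-based rule for telling text from figures. The function names and the `max_text_height` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows; 0 = black, 255 = white


def binarize(image: Image, threshold: int = 128) -> List[List[int]]:
    """Stage 1 (preparation): map each pixel to 1 (ink) or 0 (background).

    A fixed threshold is an assumption; the source does not name a method.
    """
    return [[1 if px < threshold else 0 for px in row] for row in image]


def find_blocks(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Stage 2 (location): find horizontal bands of consecutive rows with ink.

    Returns (start_row, end_row_exclusive) pairs, top to bottom.
    """
    blocks: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y  # a new band begins
        elif not has_ink and start is not None:
            blocks.append((start, y))  # a blank row closes the band
            start = None
    if start is not None:
        blocks.append((start, len(binary)))  # band runs to the page bottom
    return blocks


def classify_block(block: Tuple[int, int], max_text_height: int = 3) -> str:
    """Stage 3 (classification): hypothetical rule based on band height only.

    Short bands are labeled "text", taller ones "figure".
    """
    start, end = block
    return "text" if end - start <= max_text_height else "figure"


def analyze(image: Image) -> List[Tuple[Tuple[int, int], str]]:
    """Run all three stages and return (band, label) pairs."""
    binary = binarize(image)
    return [(block, classify_block(block)) for block in find_blocks(binary)]


if __name__ == "__main__":
    white, black = [255] * 4, [0] * 4
    # Toy page: a 2-row band, a blank row, a 5-row band, blank margins.
    page = [white, black, black, white] + [black] * 5 + [white]
    for block, label in analyze(page):
        print(block, label)
```

A real scanner would also need skew correction and column handling, which this sketch leaves out; it is meant only to show how the three stages feed into one another.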
Similar resources
Geometric Layout Analysis Techniques for Document Image Understanding: a Review
Document Image Understanding (DIU) is an interesting research area with a large variety of challenging applications. Researchers have worked for decades on this topic, as witnessed by the scientific literature. The main purpose of the present report is to describe the current status of DIU with particular attention to two subprocesses: document skew angle estimation and page decomposition. Sev...
An integrated approach to document decomposition and structural analysis
A document image is a visual representation of a paper document, such as a journal article page, a facsimile cover page, office correspondence, an application form, etc. Document image understanding as a research endeavor consists of developing processes for taking a document through various representations: from scanned image to semantic representation. This paper describes docum...
Document image analysis with cooperative interaction between layout analysis and logical structure analysis
When a printed document is to be input to a computer system, the document must be converted to a computer-readable format, e.g., ASCII, PDF, RTF, CSV, or SGML/XML/HTML-tagged data. In order to obtain these data formats from a printed document, it is necessary to extract from the printed document as much information as possible, i.e., layout structure (layout objects and their hierarchical relat...
Logical structure detection for heterogeneous document classes
We present a fully implemented system based on generic document knowledge for detecting the logical structure of documents for which only general layout information is assumed. In particular, we focus on detecting the reading order. Our system integrates components based on computer vision, artificial intelligence, and natural language processing techniques. The prominent feature of our framewo...
Based on Adaptive Split-and-Merge and Qualitative Spatial Reasoning
The ultimate goal of automatic document processing is to understand the semantics of a document. Towards such an end, one of the primary enabling steps has been to first reason about the layout of the document by means of page segmentation and segment spatial reasoning or labeling. This, in turn, allows for the derivation of document logical organization. This paper describes a generic document s...
Publication date: 2015